1 Advanced Computer Architecture


(a) For an out-of-order superscalar processor, what are false dependencies on register 
names and what hardware technique is often used to remove them? [4 marks] 


(b) Why do out-of-order superscalar processors use a store queue (sometimes called 
a store buffer) whereas simple scalar processors do not? [4 marks] 


(c) For an out-of-order superscalar processor, what are the trade-offs between using 
a reorder buffer and a unified register file to hold computed register values? 
[4 marks] 


(d) Consider the following C-code implementation of bubble sort: 


void bubbleSort(int array[], int size) {
    for (int step = 0; step < size - 1; ++step) {
        for (int i = 0; i < size - step - 1; ++i) {
            if (array[i] > array[i + 1]) {
                int temp = array[i];
                array[i] = array[i + 1];
                array[i + 1] = temp;
            }
        }
    }
}


(i) The 32-bit ARM ISA allows almost all instructions to be conditional (or
predicated) whereas the newer 64-bit ARM ISA does not. Using the above 
code as an example, how could predicated execution be used to avoid 
data-dependent branch misprediction using if-conversion? [4 marks] 


(ii) Why do modern out-of-order superscalar processors avoid predicated
execution? Why might the above code result in a lot of branch 
mispredictions? [4 marks] 


CST2.2024.8.3 


2 Bioinformatics 


(a) Discuss how to use bioinformatics algorithms to detect the specific pathogenic
sequences in the genome of a pathogenic species, by comparing its genome with
the genome of an evolutionarily-close non-pathogenic species.


Hint. Most pathogenic bacteria have long DNA sequences containing disease- 
causing genes that are not present in the genome of similar non-pathogenic 
species. Consider how to detect extra material, and perhaps inverted repeats 
(which are usually formed during the insertion of the disease-causing genes.) 

[5 marks] 


(b) Compute the global alignment and the best score of the sequences {CGTGT,
TGGCGCC} with the following parameters: match score = +2, mismatch score
= -1, gap penalty = -2. Report the final score and alignment(s). [4 marks]
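
As a rough illustration of the dynamic-programming recurrence behind global alignment, the following sketch computes only the optimal score (no traceback). It assumes a Needleman-Wunsch formulation with a linear gap penalty; the function name `global_alignment_score` and its default parameters are illustrative choices, not part of the question.

```python
def global_alignment_score(s, t, match=2, mismatch=-1, gap=-2):
    """Optimal global alignment score (Needleman-Wunsch, linear gap penalty)."""
    rows, cols = len(s) + 1, len(t) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty string costs one gap per character.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if s[i - 1] == t[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align s[i-1] with t[j-1]
                score[i - 1][j] + gap,       # gap in t
                score[i][j - 1] + gap,       # gap in s
            )
    return score[-1][-1]
```

Calling `global_alignment_score("CGTGT", "TGGCGCC")` would apply the same recurrence to the sequences in the question; recovering the alignment(s) additionally requires a traceback through the table.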
(c) Dimerisation occurs when two similar proteins (P) join together to form a dimer
(D), and dissociation reverses this process. The Gillespie algorithm may be
applied to model dimerisation and dissociation of proteins, with species P and
D, and the following rate constants:

• Dimerisation: $2P \to D$ with rate $c_1$

• Dissociation: $D \to 2P$ with rate $c_2$

Dimerisation is rare and dimers are unstable, therefore $c_2 \gg c_1$. Explain how
to use the Gillespie algorithm to model dimerisation and dissociation reactions.


[7 marks] 
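
A minimal sketch of the direct-method Gillespie loop for this two-reaction system is given below. It assumes the dimerisation propensity is the rate constant times the number of distinct P pairs, and the dissociation propensity is the rate constant times the number of dimers; the function name, argument names and the fixed seed are hypothetical choices for illustration.

```python
import random

def gillespie_dimerisation(p, d, c1, c2, t_end, seed=0):
    """Simulate 2P -> D (rate c1) and D -> 2P (rate c2) up to time t_end."""
    rng = random.Random(seed)
    t = 0.0
    trajectory = [(t, p, d)]
    while True:
        a1 = c1 * p * (p - 1) / 2.0  # assumed propensity: distinct pairs of P
        a2 = c2 * d                  # assumed propensity: one per dimer
        a0 = a1 + a2
        if a0 == 0.0:
            break                    # no reaction can fire
        t += rng.expovariate(a0)     # exponentially distributed waiting time
        if t > t_end:
            break
        if rng.random() * a0 < a1:   # pick a reaction in proportion to its propensity
            p, d = p - 2, d + 1
        else:
            p, d = p + 2, d - 1
        trajectory.append((t, p, d))
    return trajectory
```

Each step draws a waiting time from an exponential distribution with rate equal to the total propensity, then chooses which reaction fires; the quantity P + 2D is conserved throughout.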


(d) Discuss advantages and disadvantages of using DNA to store information. 


[4 marks] 
3 Cryptography


(a) YottaVPN, your employer’s main network-encryption product, generates a
master key $K \in_R \{0,1\}^{128}$ and an initial seed $R_0 \in_R \{0,1\}^{80}$ randomly once,
when the product is installed. It then uses


Algorithm (A): $R_i = \mathrm{Enc}_K(R_{i-1})$ for $i > 0$


to generate a stream $R_1, R_2, \ldots$ of session keys for encrypting individual network
connections. That algorithm then runs continuously throughout the lifetime of 
the product. Your colleague suggests to replace (A) with 


Algorithm (B): $R_i = \mathrm{Enc}_K(R_{i-1}) \oplus R_{i-1}$ for $i > 0$


because they feel that would be more secure. [Enc is a government-approved
blockcipher with 80-bit blocksize and $\oplus$ is bit-wise exclusive-or.]


(i) For each of algorithm (A) and (B), averaged over all $(K, R_0)$, what is the
expected number of different session keys $|\{R_1, R_2, \ldots\}|$ that they will be
able to generate from one $(K, R_0)$? State your assumptions. [5 marks]


(ii) What is the smallest number of different values $|\{R_1, R_2, \ldots\}|$ that could
be generated by (A) and (B) from any fixed pair $(K, R_0)$? [2 marks]


(iii) Suggest another deterministic key-derivation algorithm (C), using the same
blockcipher, 80-bit state and fixed parameters $(K, R_0)$, that maximises
$|\{R_1, R_2, \ldots\}|$. [2 marks]


(iv) Years later, a worried user discovers that, due to an operator error, the state 
$(K, R_{65535})$ of their YottaVPN installation was accidentally committed to
a publicly accessible Git repository. Compare which other values $R_i$ were
compromised by this leak, if either algorithm (A), (B), or (C) had been 
used. [6 marks] 


(v) Name a security benefit that could be claimed for algorithm (B) compared 
to (A). [1 mark] 
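
The difference between iterating a permutation, as in (A), and iterating a permutation with feedback, as in (B), can be explored empirically on a toy scale. The sketch below is an experiment under stated assumptions rather than an analysis: `toy_permutation` is a hypothetical stand-in for the blockcipher (a key-seeded random permutation on 10-bit values), and `count_session_keys` simply counts distinct outputs until the first repeat.

```python
import random

def toy_permutation(key, bits=10):
    """Hypothetical stand-in for Enc_K: a key-seeded random permutation table."""
    rng = random.Random(key)
    table = list(range(2 ** bits))
    rng.shuffle(table)
    return table

def count_session_keys(table, r0, feedback):
    """Count distinct R_1, R_2, ...; feedback=False models (A), True models (B)."""
    seen = set()
    r = r0
    while True:
        r = table[r] ^ r if feedback else table[r]
        if r in seen:
            return len(seen)
        seen.add(r)
```

Running both variants over many `(key, r0)` pairs and averaging the counts gives a feel for how the two update rules behave on a small state space.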


(b) Your colleagues designed a scheme that encrypts messages $M_i \in \{0,1\}^8$ with
one-time pads $R_i \in_R \{0,1\}^8$ into ciphertexts $C_i = M_i \oplus R_i$. But to help
estimate the frequency of transmission errors when transferring the $R_i$, they
decided to occasionally replace the last random bit of any $R_i$ with a “parity”
bit, with a probability of 0.01. As a result, the probability of any $R_i$ containing
an even number of one bits is 0.505. Does this encryption scheme offer
indistinguishability in the presence of an eavesdropper? Explain your answer.
[4 marks] 
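
The figure of 0.505 quoted above can be checked by exhaustive enumeration. The sketch below assumes the “parity” bit is chosen so that the pad has an even number of one bits, and uses exact rational arithmetic; the pad length of 8 bits is an assumption for illustration, and the result does not depend on it.

```python
from fractions import Fraction
from itertools import product

def even_parity_probability(n_bits=8, p_replace=Fraction(1, 100)):
    """Probability that a pad has an even number of one bits."""
    total = Fraction(0)
    uniform = Fraction(1, 2 ** n_bits)
    for bits in product((0, 1), repeat=n_bits):
        # Pad left untouched: even only if the random bits happen to be even.
        if sum(bits) % 2 == 0:
            total += uniform * (1 - p_replace)
        # Last bit replaced by an even-parity bit: the pad is always even.
        total += uniform * p_replace
    return total
```

The value is 0.99 × 0.5 + 0.01 × 1, i.e. a small but non-zero bias away from a uniform pad.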
4 Denotational Semantics


In all parts of this question, you are allowed to use theorems from the course, provided 
you state them precisely beforehand. 


Define the smash product $D \otimes E$ of two domains $D$ and $E$ to be the set


$\{(x, y) \in D \times E \mid x \neq \bot_D \wedge y \neq \bot_E\} \cup \{\bot_{D \otimes E}\}$


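where $\bot_{D \otimes E}$ is a new element which is not a pair. This set is equipped with the
order $\sqsubseteq_{D \otimes E}$ such that $\bot_{D \otimes E} \sqsubseteq_{D \otimes E} z$ for any $z$, and $(x, y) \sqsubseteq_{D \otimes E} (x', y')$ if and only
if $x \sqsubseteq_D x'$ and $y \sqsubseteq_E y'$.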


(a) Show that the smash product of two domains is a domain. [4 marks]


(b) Given three domains $D$, $E$ and $F$, we call a function $f \in D \times E \to F$ bistrict if
for any $x \in D$ and $y \in E$, $f(\bot, y) = \bot$ and $f(x, \bot) = \bot$.


Show that not all strict functions are bistrict. [3 marks] 


Let $D$, $E$ and $F$ be domains, and $f : D \times E \to F$ a function. Give a condition
on the currying $\mathrm{cur}(f) : D \to (E \to F)$ of $f$ that is necessary and sufficient for
f to be bistrict. [4 marks] 


We define the function smash as follows: 


$\mathrm{smash} : D \times E \to D \otimes E$
$(x, y) \mapsto (x, y)$ if $x \neq \bot$ and $y \neq \bot$
$(x, y) \mapsto \bot_{D \otimes E}$ otherwise


Show that if $f : D \times E \to F$ is continuous and bistrict, then there exists a unique
$\hat{f} : D \otimes E \to F$ that is strict and continuous and such that $f = \hat{f} \circ \mathrm{smash}$.
[4 marks] 


Give the definition of $X_\bot$, the flat domain on a set $X$. [1 mark]
Given two sets $S$ and $T$, show that the domains $(S \times T)_\bot$ and $S_\bot \otimes T_\bot$ are
isomorphic, i.e. that there exist strict continuous functions $f : (S \times T)_\bot \to S_\bot \otimes T_\bot$
and $g : S_\bot \otimes T_\bot \to (S \times T)_\bot$ such that $f \circ g = \mathrm{id}$ and $g \circ f = \mathrm{id}$.
[4 marks] 
5 E-Commerce


(a) When preparing to internationalise an E-Commerce business, describe four
factors that you need to consider and why. [4 marks]


(b) Some commentators think that E-Commerce companies should not be considered
a separate category of company. Do you think that E-Commerce companies
are fundamentally different to traditional companies? Using examples and
referencing economic frameworks, give a reasoned argument including points
for and against the proposition. [16 marks]
6 Hoare Logic and Model Checking


Consider a programming language with commands $C$ consisting of the skip no-op
command, sequential composition $C_1; C_2$, loops while $B$ do $C$ for Boolean expressions
$B$, conditionals if $B$ then $C_1$ else $C_2$, assignment $X := E$ for program
variables $X$ and arithmetic expressions $E$, heap allocation $X := \mathrm{alloc}(E_0, \ldots, E_n)$,
heap assignment $[E_1] := E_2$, heap dereference $X := [E]$, and heap location
disposal $\mathrm{dispose}(E)$. Assume null = 0, and predicates for lists and partial lists:


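$\mathrm{list}(t, []) = (t = \mathrm{null}) \wedge \mathrm{emp}$
$\mathrm{list}(t, h :: \alpha) = \exists y.\ (t \mapsto h) * ((t + 1) \mapsto y) * \mathrm{list}(y, \alpha)$
$\mathrm{plist}(t_1, [], t_2) = (t_1 = t_2) \wedge \mathrm{emp}$
$\mathrm{plist}(t_1, h :: \alpha, t_2) = \exists y.\ (t_1 \mapsto h) * ((t_1 + 1) \mapsto y) * \mathrm{plist}(y, \alpha, t_2)$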


In the following, all triples are linear separation logic triples. No proofs are required. 


(a) Precisely describe a stack and a heap that satisfy $X \mapsto Y * Y \mapsto X$. Give a
(non-looping) command $C$ that satisfies the following triple.
$\{\mathrm{emp}\}\ C\ \{X \mapsto Y * Y \mapsto X\}$. [3 marks]


(b) Define and explain a partial correctness rule for a new command $\mathrm{unseq}(C_1, C_2)$,
which executes commands $C_1$ and $C_2$ in either order ($C_1; C_2$ or $C_2; C_1$). Maintain
soundness of the proof system, and ensure the rule accurately reflects the 
behaviour of the new command. [3 marks] 


(c) Do the same for a new command $\mathrm{add\_to}(E_1, E_2)$. If expressions $E_1$ and $E_2$
evaluate to allocated, disjoint memory locations, it increments the value stored 
at the first location by the value stored at the second. Otherwise it crashes. 

[3 marks] 


For each of the following triples, give a loop invariant that would prove it. 


(d) This command duplicates each list element. As per precondition assume Y is
initially the head X; assume dup duplicates elements, e.g. dup [1,2] = [1,1,2,2].
$\{\mathrm{list}(X, \alpha) \wedge Y = X\}$
while Y ≠ null do (V:=[Y]; N:=[Y+1]; D:=alloc(V,N); [Y+1]:=D; Y:=N)
$\{\mathrm{list}(X, \mathrm{dup}\ \alpha)\}$ [4 marks]


(e) This command removes all negative numbers in a list, assuming it starts with 0.
$\{\mathrm{list}(X, [0] {+}{+} \alpha)\}$
L:=X; Y:=[X+1];
while Y ≠ null do (
  V:=[Y]; N:=[Y+1];
  (if V<0 then (dispose(Y); dispose(Y+1)) else ([L+1]:=Y; L:=Y)); Y:=N
); [L+1]:=null
$\{\mathrm{list}(X, [0] {+}{+} (\mathrm{remove\_negatives}\ \alpha))\}$ [7 marks]
7 Information Theory


Consider a set of coins, identical in appearance, but where some unknown subset is 
heavier than the others. You have a balance scale with two pans that can be used to 
tell whether the contents of one pan are heavier, the same or lighter than the other. 


Each weighing of coin subsets can be represented graphically as per example below, 
which identifies the coins on the left pan (L), right pan (R) and those put aside (P). 
A series of weighings can be represented by a tree of such nodes. 


[Figure: an example weighing node labelled L: 1, 2; R: 3, 4; P: 5, with three
outgoing branches labelled "Left heavier", "Balanced" and "Right heavier".]


(a) Define Discrete Entropy mathematically and conceptually and explain how it
can be applied in weighing problems to reduce the overall number of weighings
required to find the heavy coins. How would you expect this to compare to
a naive strategy of evenly partitioning the heavier set of coins on the next
weighing? [5 marks]
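
As a small numerical aid, the sketch below computes the Shannon entropy of the three-way outcome of a single weighing, assuming exactly one heavy coin that is equally likely to be any coin. The helper names `entropy` and `weighing_outcome_distribution` are illustrative, and the nine-coin figures used afterwards are an example chosen here, not taken from the question.

```python
from math import log2

def entropy(probabilities):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * log2(p) for p in probabilities if p > 0)

def weighing_outcome_distribution(n_coins, per_pan):
    """Outcome probabilities [left heavier, balanced, right heavier]
    when per_pan coins are placed on each pan and the rest put aside."""
    left = per_pan / n_coins
    right = per_pan / n_coins
    balanced = (n_coins - 2 * per_pan) / n_coins
    return [left, balanced, right]
```

For nine coins, placing three on each pan makes all three outcomes equally likely, so the weighing yields the maximum of log2(3) bits; placing four on each pan yields less.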


(b) If the set contains six coins of which one is heavy:


(i) Draw the weighings tree for the naive binary partitioning strategy and
compute the average number of weighings required. [3 marks]


(ii) Draw the weighings tree for the Entropy-based strategy and compute the
average number of weighings required. [3 marks]


(iii) Reconcile your answer to (a) with your answers to (b)(i), (b)(ii).
[2 marks]


(c) If the set contains six coins, two of which are heavy, draw an Entropy-based
weighing strategy. You should assume the two heavy coins have the same 
weight as each other. Explain your answer and compute the average number of 
weighings needed. [7 marks] 
8 Machine Learning and Bayesian Inference


Suppose a Bayesian network has the form of a chain: a sequence of Boolean random
variables $X_1, \ldots, X_n$ where $\mathrm{Parents}(X_i) = \{X_{i-1}\}$ for $i = 2, \ldots, n$.


$X_1 \to X_2 \to X_3 \to \cdots \to X_n \qquad (1)$


Derive an expression for the probability $\Pr(X_1 = x_1 \mid X_n = \mathrm{True})$. You may
neglect the normalising factor. [2 marks]


Derive the time complexity for computing $\Pr(X_1 = x_1 \mid X_n = \mathrm{True})$ using variable
elimination. Contrast against exact inference without memoization. [6 marks] 
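
The contrast between eliminating variables one at a time and summing over every joint assignment can be made concrete with a short sketch. It assumes the query is the unnormalised posterior of the first variable given that the last is True, with Boolean values encoded as 0 and 1; the table layout (`prior[x]`, `transitions[k][a][b]`) and function names are hypothetical.

```python
from itertools import product

def chain_posterior_unnormalised(prior, transitions):
    """Variable elimination from the evidence end: one 2x2 sum per link.
    transitions[k][a][b] is Pr(next = b | current = a) for link k."""
    message = [0.0, 1.0]  # indicator that the last variable is True (index 1)
    for table in reversed(transitions):
        message = [sum(table[a][b] * message[b] for b in (0, 1)) for a in (0, 1)]
    return [prior[x] * message[x] for x in (0, 1)]

def chain_posterior_bruteforce(prior, transitions):
    """Enumerate all 2^n assignments and sum the joint where the last is True."""
    n = len(transitions) + 1
    result = [0.0, 0.0]
    for xs in product((0, 1), repeat=n):
        if xs[-1] != 1:
            continue
        p = prior[xs[0]]
        for k, table in enumerate(transitions):
            p *= table[xs[k]][xs[k + 1]]
        result[xs[0]] += p
    return result
```

Both functions return the same pair of unnormalised values; the first does a constant amount of work per link, while the second visits every joint assignment.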


State the E and M steps in the expectation-maximisation algorithm for 
parameter estimation in a problem involving latent variables Z and observed 
data X. [4 marks] 
Henceforth, let $\theta$ denote the parameters to be estimated and $X$ denote data we
observe. Justify why the EM algorithm locally maximises $p(X \mid \theta)$ with respect
to $\theta$. [3 marks]
Consider the Bayesian network depicted in the figure below. $\{X_i\}_{i=1}^{n}$ denote
observed variables while the $\{Z_i\}_{i=1}^{n}$ are unobserved. We place the following
distributions on the random variables:

• $Z_1$ and $Z_i \mid Z_{i-1} = l$ are Bernoulli-distributed, where $l \in \{0, 1\}$.

• $X_i \mid Z_i = l$ is a univariate normal distribution, where $l \in \{0, 1\}$.
Collectively denote the parameters of the above distributions as $\theta$. Give the


factorisation of the joint probability indicated by the Bayesian network structure 
in terms of the given distributions. [5 marks] 
9 Optimising Compilers


The following function in C-style code is optimised by a compiler. Assume that 
variable arg0 is the argument to the function and so has already been defined.


x = arg0; 
y = x * 2;
z = x * 4;


while (true) { 
    if (x >= 10) {


        y = z + 1;
        print(y);
        break;
    }
    x = x + 1;
}
y = arg0;
print(y);
What is the live range of a variable? [2 marks] 


User variables are assumed to reside in the same virtual register in the
intermediate representation throughout the entire program. How can static
single assignment (SSA) form help reduce their live ranges? [2 marks]
Put the code above into SSA form. [4 marks] 
Describe and give the dataflow equation for live variable analysis. [4 marks] 


Perform live variable analysis on the original code at the beginning of the 
question and use it to perform dead-code elimination, showing the in-live sets 
after the analysis. [4 marks] 


Describe how dataflow analyses, such as live variable analysis, could be simplified 
if the code was in SSA form. [4 marks] 
10 Principles of Communications


(a) Fibbing is a technique for adding custom forwarding information base entries
in a routing domain. A controller masquerades as a router, which injects more 
specific destinations and shorter paths by some metric, than the ones discovered 
by the regular link state routing algorithm. Typically, the goal is to support a 
traffic engineering policy for some destination or source, for example for lower 
latency, or higher capacity. 


A consortium of Internet Service Providers propose to use the same idea 
for Inter-Domain routing, by announcing specialised paths using the Border 
Gateway Protocol (BGP). They have heard of path-prepending as a technique 
to influence inbound traffic from neighboring Autonomous Systems (ASs). Of 
course, they can use local preferences for outbound traffic, so that doesn’t need 
external influence. 


(i) How can we use the same idea as fibbing to inject BGP announcements
that will influence inbound traffic, so as to create different paths for different
destinations within this AS? Your answer should address the challenge that
BGP is path-vector, not link-state, and that there are local filtering policies
that may interfere with path advertisements. [10 marks]


(ii) What is the potential security problem (think about trust)? [5 marks]


(b) Imagine we wished to optimise a network for reliability, rather than, say, delay.
Consider the approach of minimising delay by iteratively moving a portion of
traffic from one path to others. How might this approach be adapted to provide
reliability? [5 marks]
11 Quantum Computing


(a) In which quantum and classical computational complexity classes is factoring?
[2 marks]


(b) Shor’s algorithm is used to factor N = 21. Shor’s algorithm requires a positive
integer x, which is greater than one and less than N, to be chosen at random.


(i) What property must x have for Shor’s algorithm not to terminate early? If
x = 14 is chosen, when does Shor’s algorithm terminate? [3 marks]


(ii) Instead x = 4 is chosen. What is the order of 4 modulo 21? [3 marks]


(iii) If Shor’s algorithm is run with x = 4, explain what happens. Is the correct
answer returned? [3 marks]
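
For reference, the order of x modulo N is the smallest positive r with x^r ≡ 1 (mod N). The sketch below finds it by classical brute force, which is feasible only for small N and is not the quantum period-finding step of Shor’s algorithm; the function name `multiplicative_order` is an illustrative choice.

```python
from math import gcd

def multiplicative_order(x, n):
    """Smallest r > 0 with x**r congruent to 1 modulo n (requires gcd(x, n) == 1)."""
    if n <= 1:
        raise ValueError("n must be greater than 1")
    if gcd(x, n) != 1:
        raise ValueError("x and n must be coprime for the order to exist")
    r, value = 1, x % n
    while value != 1:
        value = (value * x) % n  # multiply by x once more
        r += 1
    return r
```

If x shares a factor with N, no such r exists, which is why the function rejects that case rather than looping forever.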


(c) Consider a three-state quantum automaton with initial state $|0\rangle$ and a single
accepting state $|2\rangle$. The input letters are c and d, with transition matrices
respectively:


[Transition matrices $M_c$ and $M_d$: two $3 \times 3$ matrices whose entries are illegible in the source.]


(i) For quantum automata what property must hold for the transition matrices
of each letter of the alphabet? [1 mark] 


(ii) Verify that this property holds for $M_c$ and $M_d$. [4 marks]


(iii) Give a four-letter input string containing two occurrences of c and two
occurrences of d that is accepted with 100% probability. [2 marks] 


(iv) Give an eight-letter input string containing both c and d that returns to 
the initial state with 100% probability. [2 marks] 
12 Randomised Algorithms


Consider the allocation of n balls into n bins, both labelled [n] := {1,2,...,n}. 
We assume each ball is assigned to a bin chosen uniformly and independently at 
random. 


(a) For any given bin, what is the expectation and variance of its load? [2 marks] 


(b) What is known about the maximum load across all n bins? A proof or 
justification is not required here. [2 marks] 
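
The allocation process described above is easy to simulate. The following sketch throws n balls into n bins uniformly and independently and returns the load of each bin; the function name `bin_loads` and the fixed seed are hypothetical choices made for reproducibility.

```python
import random
from collections import Counter

def bin_loads(n, seed=0):
    """Allocate n balls to n bins uniformly at random; return the load of each bin."""
    rng = random.Random(seed)
    counts = Counter(rng.randrange(n) for _ in range(n))
    return [counts[b] for b in range(n)]
```

Inspecting `max(bin_loads(n))` for increasing n gives an empirical feel for how slowly the maximum load grows compared with the average load.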


Assume now that each ball $j \in [n]$ has a random processing time $B_j$, which has a
mean one exponential distribution, i.e., for any $t > 0$, $\mathbb{P}[B_j > t] = e^{-t}$. For a bin
$i \in [n]$, let $T_i$ be the sum of the processing times of balls allocated to $i$.


(c) Show that $\mathbb{E}[T_i] = 1$ for every bin $i \in [n]$. For full marks, your answer should
include a justification and a formal definition of $T_i$. [4 marks]


(d) Find a constant $c > 0$ such that the probability that a fixed ball has processing
time at least $c \cdot \log n$ is at least $n^{-1/2}$? [2 marks]


(e) Using part (d), argue that with high probability, at least one ball has a processing
time of at least $c \cdot \log n$. [4 marks]


(f) Let $B := \sum_{j=1}^{n} B_j$ be the total processing time of all n balls. Prove a Chernoff
Bound of the form $\mathbb{P}[B > (1 + \delta) \cdot \mathbb{E}[B]]$, for any $\delta > 0$.
Hints: You may use the fact that for $Z$ being exponentially distributed with
mean 1, it holds for any $0 < \lambda < 1$ that $\mathbb{E}[e^{\lambda Z}] \leq \frac{1}{1 - \lambda}$. Also you may want to
choose $\lambda = \frac{\delta}{1 + \delta}$ when optimising the tail bound. [6 marks]
13 Types


Consider the simply-typed lambda calculus with only function types and boolean 
types, with true, false, and if-then-else term formers for the boolean type. 


(a) Define a logical relation suitable for establishing the termination of programs in
this language. [4 marks]

(b) State the closure property of the logical relation. [2 marks]

(c) Prove closure for the case of the boolean type. [6 marks]

(d) State the fundamental lemma for this language. [2 marks]

(e) Prove the fundamental lemma for the if-then-else case. [6 marks]


END OF PAPER 